14 - Programming Techniques for Supercomputers [ID:32668]

Duration

01:28:01 Min

Recording date

2021-05-11

Uploaded on

2021-05-11 19:46:22

Language

en-US

This lecture investigates the performance of the Schönauer vector triad benchmark across the full memory hierarchy of a single core of an Intel Haswell processor. An analysis of the data transfers through the memory hierarchy yields a performance model that qualitatively describes the performance levels for data sets residing in the different hierarchy levels. The lecture then examines dense matrix-vector multiplication to identify performance improvements from increased temporal reuse of vector data. As a first optimization strategy, outer-loop unroll-and-jam is identified and successfully tested.
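
For orientation, the two kernels named above can be sketched in C. This is a minimal illustration and not the code used in the lecture. The function names, the row-major matrix layout, the loop order, and the unroll factor of 2 are all assumptions made here. The triad computes a[i] = b[i] + c[i] * d[i]. In the matrix-vector sketch, unrolling the outer (row) loop by two and fusing ("jamming") the two inner loops lets each loaded element of x serve two rows, which is one way to increase temporal reuse of vector data.

```c
#include <stddef.h>

/* Schoenauer vector triad: a[i] = b[i] + c[i] * d[i].
 * Three loads, one store, and two flops per iteration. */
void vector_triad(size_t n, double *a, const double *b,
                  const double *c, const double *d)
{
    for (size_t i = 0; i < n; ++i)
        a[i] = b[i] + c[i] * d[i];
}

/* Plain dense matrix-vector multiplication y += A * x.
 * Assumption: A is stored row-major with n_rows x n_cols entries. */
void dmvm_plain(size_t n_rows, size_t n_cols, const double *A,
                const double *x, double *y)
{
    for (size_t r = 0; r < n_rows; ++r)
        for (size_t c = 0; c < n_cols; ++c)
            y[r] += A[r * n_cols + c] * x[c];
}

/* Same operation with 2-way outer-loop unroll-and-jam (factor 2 is an
 * illustrative choice): two rows are processed per outer iteration and
 * share one fused inner loop, so x[c] is loaded once and used twice. */
void dmvm_unroll2(size_t n_rows, size_t n_cols, const double *A,
                  const double *x, double *y)
{
    size_t r = 0;
    for (; r + 1 < n_rows; r += 2) {
        double s0 = y[r];
        double s1 = y[r + 1];
        for (size_t c = 0; c < n_cols; ++c) {
            const double xc = x[c];            /* reused for both rows */
            s0 += A[r * n_cols + c] * xc;
            s1 += A[(r + 1) * n_cols + c] * xc;
        }
        y[r] = s0;
        y[r + 1] = s1;
    }
    /* Remainder loop for an odd number of rows. */
    for (; r < n_rows; ++r) {
        double s = y[r];
        for (size_t c = 0; c < n_cols; ++c)
            s += A[r * n_cols + c] * x[c];
        y[r] = s;
    }
}
```

Both matrix-vector variants accumulate each row in the same order, so they should produce identical results. Whether the unrolled version is actually faster depends on the compiler, the data set size, and the machine, which is the kind of question the lecture's performance model addresses.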